Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 4250 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.7 MiB |
| Average record size in memory | 426.5 B |
Variable types
| NUM | 15 |
|---|---|
| BOOL | 3 |
| CAT | 2 |
Reproduction
| Analysis started | 2021-11-04 18:40:12.041815 |
|---|---|
| Analysis finished | 2021-11-04 18:40:56.725805 |
| Duration | 44.68 seconds |
| Version | pandas-profiling v2.7.1 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| state has a high cardinality: 51 distinct values | High cardinality |
| total_day_charge is highly correlated with total_day_minutes | High correlation |
| total_day_minutes is highly correlated with total_day_charge | High correlation |
| total_eve_charge is highly correlated with total_eve_minutes | High correlation |
| total_eve_minutes is highly correlated with total_eve_charge | High correlation |
| total_night_charge is highly correlated with total_night_minutes | High correlation |
| total_night_minutes is highly correlated with total_night_charge | High correlation |
| total_intl_charge is highly correlated with total_intl_minutes | High correlation |
| total_intl_minutes is highly correlated with total_intl_charge | High correlation |
| number_vmail_messages has 3139 (73.9%) zeros | Zeros |
| number_customer_service_calls has 886 (20.8%) zeros | Zeros |
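The charge/minutes correlation warnings suggest that each charge column may simply be a fixed per-minute rate applied to the matching minutes column. A minimal sketch of how one might check this, using the day-time (minutes, charge) pairs from the first five sample rows of this report; the helper name `implied_rates` is illustrative and not part of pandas-profiling:

```python
# (total_day_minutes, total_day_charge) pairs copied from the report's sample rows.
day_pairs = [
    (161.6, 27.47),
    (243.4, 41.38),
    (299.4, 50.90),
    (166.7, 28.34),
    (218.2, 37.09),
]


def implied_rates(pairs):
    """Return charge / minutes for every pair with non-zero minutes."""
    return [charge / minutes for minutes, charge in pairs if minutes > 0]


rates = implied_rates(day_pairs)
spread = max(rates) - min(rates)
print(f"implied rate per minute: {min(rates):.4f} to {max(rates):.4f}")
print(f"spread: {spread:.5f}")
```

If the spread stays close to zero over the full data set, one column of each pair is effectively a rescaled copy of the other, and keeping only one of them is usually a reasonable choice before modelling.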
state
Categorical
| Distinct count | 51 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 33.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| WV | 139 | 3.3% | |
| MN | 108 | 2.5% | |
| ID | 106 | 2.5% | |
| AL | 101 | 2.4% | |
| VA | 100 | 2.4% | |
| OR | 99 | 2.3% | |
| TX | 98 | 2.3% | |
| UT | 97 | 2.3% | |
| NY | 96 | 2.3% | |
| NJ | 96 | 2.3% | |
| Other values (41) | 3210 | 75.5% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
Unicode categories
| Value | Count | Frequency (%) |
|---|---|---|
| Uppercase_Letter | 24 | 100.0% |
Unicode scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 24 | 100.0% |
Unicode blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 24 | 100.0% |
account_length
Real number (ℝ≥0)
| Distinct count | 215 |
|---|---|
| Unique (%) | 5.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100.23623529411765 |
|---|---|
| Minimum | 1 |
| Maximum | 243 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 35.45 |
| Q1 | 73 |
| median | 100 |
| Q3 | 127 |
| 95-th percentile | 167 |
| Maximum | 243 |
| Range | 242 |
| Interquartile range (IQR) | 54 |
Descriptive statistics
| Standard deviation | 39.69840057 |
|---|---|
| Coefficient of variation (CV) | 0.3960483996 |
| Kurtosis | -0.1321747749 |
| Mean | 100.2362353 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.1223273244 |
| Sum | 426004 |
| Variance | 1575.963008 |
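Several entries in the account_length statistics are derived from others, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation; the variable names are illustrative only.

```python
import math

# Values reported for account_length in this section.
minimum, maximum = 1, 243
q1, q3 = 73, 127
mean = 100.2362353
std = 39.69840057

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance, 3))

# Compare against the figures printed in the report.
assert math.isclose(cv, 0.3960483996, rel_tol=1e-6)
assert math.isclose(variance, 1575.963008, rel_tol=1e-6)
```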
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 90 | 53 | 1.2% | |
| 87 | 51 | 1.2% | |
| 93 | 50 | 1.2% | |
| 100 | 48 | 1.1% | |
| 120 | 48 | 1.1% | |
| 105 | 48 | 1.1% | |
| 116 | 47 | 1.1% | |
| 98 | 47 | 1.1% | |
| 127 | 47 | 1.1% | |
| 112 | 46 | 1.1% | |
| Other values (205) | 3765 | 88.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 7 | 0.2% | |
| 2 | 2 | < 0.1% | |
| 3 | 7 | 0.2% | |
| 4 | 2 | < 0.1% | |
| 5 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 243 | 1 | < 0.1% | |
| 232 | 2 | < 0.1% | |
| 225 | 2 | < 0.1% | |
| 224 | 2 | < 0.1% | |
| 222 | 2 | < 0.1% |
area_code
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 33.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| area_code_415 | 2108 | 49.6% | |
| area_code_408 | 1086 | 25.6% | |
| area_code_510 | 1056 | 24.8% |
Length
| Max length | 13 |
|---|---|
| Mean length | 13 |
| Min length | 13 |
Unicode categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase_Letter | 6 | 50.0% | |
| Decimal_Number | 5 | 41.7% | |
| Connector_Punctuation | 1 | 8.3% |
Unicode scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 6 | 50.0% | |
| Common | 6 | 50.0% |
Unicode blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 12 | 100.0% |
international_plan
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 33.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| no | 3854 | 90.7% | |
| yes | 396 | 9.3% |
voice_mail_plan
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 33.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| no | 3138 | 73.8% | |
| yes | 1112 | 26.2% |
number_vmail_messages
Real number (ℝ≥0)
| Distinct count | 46 |
|---|---|
| Unique (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.631764705882353 |
|---|---|
| Minimum | 0 |
| Maximum | 52 |
| Zeros | 3139 |
| Zeros (%) | 73.9% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 16 |
| 95-th percentile | 36 |
| Maximum | 52 |
| Range | 52 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 13.4398822 |
|---|---|
| Coefficient of variation (CV) | 1.761045147 |
| Kurtosis | 0.2730383375 |
| Mean | 7.631764706 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.373091038 |
| Sum | 32435 |
| Variance | 180.6304335 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3139 | 73.9% | |
| 31 | 69 | 1.6% | |
| 28 | 58 | 1.4% | |
| 29 | 57 | 1.3% | |
| 24 | 57 | 1.3% | |
| 33 | 55 | 1.3% | |
| 27 | 54 | 1.3% | |
| 26 | 53 | 1.2% | |
| 32 | 47 | 1.1% | |
| 30 | 47 | 1.1% | |
| Other values (36) | 614 | 14.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3139 | 73.9% | |
| 4 | 1 | < 0.1% | |
| 6 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 10 | 4 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 52 | 1 | < 0.1% | |
| 50 | 2 | < 0.1% | |
| 49 | 3 | 0.1% | |
| 48 | 4 | 0.1% | |
| 47 | 4 | 0.1% |
total_day_minutes
Real number (ℝ≥0)
| Distinct count | 1843 |
|---|---|
| Unique (%) | 43.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.2596 |
|---|---|
| Minimum | 0.0 |
| Maximum | 351.5 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 91.59 |
| Q1 | 143.325 |
| median | 180.45 |
| Q3 | 216.2 |
| 95-th percentile | 271.055 |
| Maximum | 351.5 |
| Range | 351.5 |
| Interquartile range (IQR) | 72.875 |
Descriptive statistics
| Standard deviation | 54.01237333 |
|---|---|
| Coefficient of variation (CV) | 0.2996365982 |
| Kurtosis | -0.05670971637 |
| Mean | 180.2596 |
| Median Absolute Deviation (MAD) | 36.6 |
| Skewness | -0.006910229801 |
| Sum | 766103.3 |
| Variance | 2917.336473 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 189.3 | 10 | 0.2% | |
| 180 | 9 | 0.2% | |
| 184.5 | 8 | 0.2% | |
| 154 | 8 | 0.2% | |
| 177.1 | 8 | 0.2% | |
| 168.6 | 7 | 0.2% | |
| 230.7 | 7 | 0.2% | |
| 183.6 | 7 | 0.2% | |
| 197 | 7 | 0.2% | |
| 185 | 7 | 0.2% | |
| Other values (1833) | 4172 | 98.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2 | < 0.1% | |
| 2.6 | 1 | < 0.1% | |
| 6.6 | 1 | < 0.1% | |
| 7.2 | 1 | < 0.1% | |
| 7.8 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 351.5 | 1 | < 0.1% | |
| 346.8 | 1 | < 0.1% | |
| 345.3 | 1 | < 0.1% | |
| 338.4 | 1 | < 0.1% | |
| 337.4 | 1 | < 0.1% |
total_day_calls
Real number (ℝ≥0)
| Distinct count | 120 |
|---|---|
| Unique (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 99.90729411764706 |
|---|---|
| Minimum | 0 |
| Maximum | 165 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 67 |
| Q1 | 87 |
| median | 100 |
| Q3 | 113 |
| 95-th percentile | 133 |
| Maximum | 165 |
| Range | 165 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 19.85081731 |
|---|---|
| Coefficient of variation (CV) | 0.1986923726 |
| Kurtosis | 0.1935936484 |
| Mean | 99.90729412 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.08581246337 |
| Sum | 424606 |
| Variance | 394.054948 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 105 | 101 | 2.4% | |
| 95 | 97 | 2.3% | |
| 110 | 92 | 2.2% | |
| 94 | 92 | 2.2% | |
| 112 | 90 | 2.1% | |
| 102 | 89 | 2.1% | |
| 97 | 88 | 2.1% | |
| 107 | 87 | 2.0% | |
| 100 | 85 | 2.0% | |
| 108 | 84 | 2.0% | |
| Other values (110) | 3345 | 78.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2 | < 0.1% | |
| 30 | 1 | < 0.1% | |
| 34 | 1 | < 0.1% | |
| 35 | 1 | < 0.1% | |
| 36 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 165 | 1 | < 0.1% | |
| 160 | 2 | < 0.1% | |
| 158 | 2 | < 0.1% | |
| 157 | 2 | < 0.1% | |
| 156 | 3 | 0.1% |
total_day_charge
Real number (ℝ≥0)
| Distinct count | 1843 |
|---|---|
| Unique (%) | 43.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 30.644682352941174 |
|---|---|
| Minimum | 0.0 |
| Maximum | 59.76 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 15.5735 |
| Q1 | 24.365 |
| median | 30.68 |
| Q3 | 36.75 |
| 95-th percentile | 46.081 |
| Maximum | 59.76 |
| Range | 59.76 |
| Interquartile range (IQR) | 12.385 |
Descriptive statistics
| Standard deviation | 9.182096033 |
|---|---|
| Coefficient of variation (CV) | 0.2996309744 |
| Kurtosis | -0.0565844345 |
| Mean | 30.64468235 |
| Median Absolute Deviation (MAD) | 6.225 |
| Skewness | -0.006912526228 |
| Sum | 130239.9 |
| Variance | 84.31088755 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 32.18 | 10 | 0.2% | |
| 30.6 | 9 | 0.2% | |
| 30.11 | 8 | 0.2% | |
| 31.37 | 8 | 0.2% | |
| 26.18 | 8 | 0.2% | |
| 31.45 | 7 | 0.2% | |
| 34.58 | 7 | 0.2% | |
| 29.58 | 7 | 0.2% | |
| 28.63 | 7 | 0.2% | |
| 28.66 | 7 | 0.2% | |
| Other values (1833) | 4172 | 98.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2 | < 0.1% | |
| 0.44 | 1 | < 0.1% | |
| 1.12 | 1 | < 0.1% | |
| 1.22 | 1 | < 0.1% | |
| 1.33 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 59.76 | 1 | < 0.1% | |
| 58.96 | 1 | < 0.1% | |
| 58.7 | 1 | < 0.1% | |
| 57.53 | 1 | < 0.1% | |
| 57.36 | 1 | < 0.1% |
total_eve_minutes
Real number (ℝ≥0)
| Distinct count | 1773 |
|---|---|
| Unique (%) | 41.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 200.17390588235293 |
|---|---|
| Minimum | 0.0 |
| Maximum | 359.3 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 118.2 |
| Q1 | 165.925 |
| median | 200.7 |
| Q3 | 233.775 |
| 95-th percentile | 282.71 |
| Maximum | 359.3 |
| Range | 359.3 |
| Interquartile range (IQR) | 67.85 |
Descriptive statistics
| Standard deviation | 50.24951818 |
|---|---|
| Coefficient of variation (CV) | 0.2510293135 |
| Kurtosis | 0.04345320215 |
| Mean | 200.1739059 |
| Median Absolute Deviation (MAD) | 33.7 |
| Skewness | -0.03041458624 |
| Sum | 850739.1 |
| Variance | 2525.014078 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 230.9 | 10 | 0.2% | |
| 187.5 | 9 | 0.2% | |
| 194 | 9 | 0.2% | |
| 169.9 | 9 | 0.2% | |
| 199.7 | 9 | 0.2% | |
| 201 | 8 | 0.2% | |
| 216.5 | 8 | 0.2% | |
| 223.5 | 8 | 0.2% | |
| 209.4 | 8 | 0.2% | |
| 211.5 | 8 | 0.2% | |
| Other values (1763) | 4164 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 22.3 | 1 | < 0.1% | |
| 37.8 | 1 | < 0.1% | |
| 41.7 | 1 | < 0.1% | |
| 42.2 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 359.3 | 1 | < 0.1% | |
| 352.1 | 1 | < 0.1% | |
| 351.6 | 1 | < 0.1% | |
| 349.4 | 1 | < 0.1% | |
| 348.5 | 1 | < 0.1% |
total_eve_calls
Real number (ℝ≥0)
| Distinct count | 123 |
|---|---|
| Unique (%) | 2.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100.17647058823529 |
|---|---|
| Minimum | 0 |
| Maximum | 170 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 67 |
| Q1 | 87 |
| median | 100 |
| Q3 | 114 |
| 95-th percentile | 133 |
| Maximum | 170 |
| Range | 170 |
| Interquartile range (IQR) | 27 |
Descriptive statistics
| Standard deviation | 19.9085911 |
|---|---|
| Coefficient of variation (CV) | 0.1987352019 |
| Kurtosis | 0.1145997215 |
| Mean | 100.1764706 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.02081182363 |
| Sum | 425750 |
| Variance | 396.3519998 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 105 | 98 | 2.3% | |
| 103 | 96 | 2.3% | |
| 91 | 95 | 2.2% | |
| 97 | 91 | 2.1% | |
| 94 | 88 | 2.1% | |
| 96 | 88 | 2.1% | |
| 108 | 88 | 2.1% | |
| 88 | 87 | 2.0% | |
| 101 | 86 | 2.0% | |
| 104 | 85 | 2.0% | |
| Other values (113) | 3348 | 78.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 12 | 1 | < 0.1% | |
| 36 | 1 | < 0.1% | |
| 38 | 1 | < 0.1% | |
| 43 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 170 | 1 | < 0.1% | |
| 169 | 1 | < 0.1% | |
| 168 | 1 | < 0.1% | |
| 159 | 1 | < 0.1% | |
| 157 | 1 | < 0.1% |
total_eve_charge
Real number (ℝ≥0)
| Distinct count | 1572 |
|---|---|
| Unique (%) | 37.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.015011764705886 |
|---|---|
| Minimum | 0.0 |
| Maximum | 30.54 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10.05 |
| Q1 | 14.1025 |
| median | 17.06 |
| Q3 | 19.8675 |
| 95-th percentile | 24.031 |
| Maximum | 30.54 |
| Range | 30.54 |
| Interquartile range (IQR) | 5.765 |
Descriptive statistics
| Standard deviation | 4.271211992 |
|---|---|
| Coefficient of variation (CV) | 0.2510260969 |
| Kurtosis | 0.04332949445 |
| Mean | 17.01501176 |
| Median Absolute Deviation (MAD) | 2.86 |
| Skewness | -0.03038789084 |
| Sum | 72313.8 |
| Variance | 18.24325188 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 16.12 | 13 | 0.3% | |
| 18.79 | 13 | 0.3% | |
| 14.25 | 13 | 0.3% | |
| 16.97 | 12 | 0.3% | |
| 15.9 | 12 | 0.3% | |
| 18.96 | 11 | 0.3% | |
| 16.8 | 10 | 0.2% | |
| 19.63 | 10 | 0.2% | |
| 17.09 | 10 | 0.2% | |
| 16.41 | 9 | 0.2% | |
| Other values (1562) | 4137 | 97.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 1.9 | 1 | < 0.1% | |
| 3.21 | 1 | < 0.1% | |
| 3.54 | 1 | < 0.1% | |
| 3.59 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 30.54 | 1 | < 0.1% | |
| 29.93 | 1 | < 0.1% | |
| 29.89 | 1 | < 0.1% | |
| 29.7 | 1 | < 0.1% | |
| 29.62 | 1 | < 0.1% |
total_night_minutes
Real number (ℝ≥0)
| Distinct count | 1757 |
|---|---|
| Unique (%) | 41.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 200.52788235294116 |
|---|---|
| Minimum | 0.0 |
| Maximum | 395.0 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 118.09 |
| Q1 | 167.225 |
| median | 200.45 |
| Q3 | 234.7 |
| 95-th percentile | 282.71 |
| Maximum | 395 |
| Range | 395 |
| Interquartile range (IQR) | 67.475 |
Descriptive statistics
| Standard deviation | 50.35354807 |
|---|---|
| Coefficient of variation (CV) | 0.251104971 |
| Kurtosis | 0.1148535776 |
| Mean | 200.5278824 |
| Median Absolute Deviation (MAD) | 33.55 |
| Skewness | 0.008490819348 |
| Sum | 852243.5 |
| Variance | 2535.479804 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 186.2 | 11 | 0.3% | |
| 208.9 | 10 | 0.2% | |
| 188.2 | 8 | 0.2% | |
| 169.4 | 8 | 0.2% | |
| 193.6 | 8 | 0.2% | |
| 230.1 | 8 | 0.2% | |
| 190.5 | 8 | 0.2% | |
| 228.1 | 8 | 0.2% | |
| 214.7 | 8 | 0.2% | |
| 214 | 8 | 0.2% | |
| Other values (1747) | 4165 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 23.2 | 1 | < 0.1% | |
| 43.7 | 1 | < 0.1% | |
| 45 | 1 | < 0.1% | |
| 46.7 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 395 | 1 | < 0.1% | |
| 381.9 | 1 | < 0.1% | |
| 381.6 | 1 | < 0.1% | |
| 377.5 | 1 | < 0.1% | |
| 367.7 | 1 | < 0.1% |
total_night_calls
Real number (ℝ≥0)
| Distinct count | 128 |
|---|---|
| Unique (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 99.8395294117647 |
|---|---|
| Minimum | 0 |
| Maximum | 175 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 67 |
| Q1 | 86 |
| median | 100 |
| Q3 | 113 |
| 95-th percentile | 132 |
| Maximum | 175 |
| Range | 175 |
| Interquartile range (IQR) | 27 |
Descriptive statistics
| Standard deviation | 20.09321979 |
|---|---|
| Coefficient of variation (CV) | 0.2012551532 |
| Kurtosis | 0.07721835856 |
| Mean | 99.83952941 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.005273110227 |
| Sum | 424318 |
| Variance | 403.7374815 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 105 | 100 | 2.4% | |
| 99 | 92 | 2.2% | |
| 95 | 91 | 2.1% | |
| 102 | 90 | 2.1% | |
| 91 | 88 | 2.1% | |
| 94 | 88 | 2.1% | |
| 104 | 87 | 2.0% | |
| 98 | 87 | 2.0% | |
| 100 | 86 | 2.0% | |
| 109 | 85 | 2.0% | |
| Other values (118) | 3356 | 79.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 33 | 1 | < 0.1% | |
| 36 | 1 | < 0.1% | |
| 38 | 2 | < 0.1% | |
| 40 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 175 | 1 | < 0.1% | |
| 170 | 1 | < 0.1% | |
| 165 | 1 | < 0.1% | |
| 164 | 1 | < 0.1% | |
| 161 | 1 | < 0.1% |
total_night_charge
Real number (ℝ≥0)
| Distinct count | 992 |
|---|---|
| Unique (%) | 23.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.023891764705883 |
|---|---|
| Minimum | 0.0 |
| Maximum | 17.77 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5.3145 |
| Q1 | 7.5225 |
| median | 9.02 |
| Q3 | 10.56 |
| 95-th percentile | 12.7255 |
| Maximum | 17.77 |
| Range | 17.77 |
| Interquartile range (IQR) | 3.0375 |
Descriptive statistics
| Standard deviation | 2.265921811 |
|---|---|
| Coefficient of variation (CV) | 0.2511025033 |
| Kurtosis | 0.1148651735 |
| Mean | 9.023891765 |
| Median Absolute Deviation (MAD) | 1.51 |
| Skewness | 0.008444754041 |
| Sum | 38351.54 |
| Variance | 5.134401655 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 9.4 | 18 | 0.4% | |
| 9.63 | 17 | 0.4% | |
| 8.15 | 17 | 0.4% | |
| 10.8 | 17 | 0.4% | |
| 9.66 | 16 | 0.4% | |
| 8.82 | 15 | 0.4% | |
| 10.49 | 15 | 0.4% | |
| 9.76 | 15 | 0.4% | |
| 8.57 | 14 | 0.3% | |
| 10.35 | 14 | 0.3% | |
| Other values (982) | 4092 | 96.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% | |
| 1.04 | 1 | < 0.1% | |
| 1.97 | 1 | < 0.1% | |
| 2.03 | 1 | < 0.1% | |
| 2.1 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 17.77 | 1 | < 0.1% | |
| 17.19 | 1 | < 0.1% | |
| 17.17 | 1 | < 0.1% | |
| 16.99 | 1 | < 0.1% | |
| 16.55 | 1 | < 0.1% |
total_intl_minutes
Real number (ℝ≥0)
| Distinct count | 168 |
|---|---|
| Unique (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.256070588235294 |
|---|---|
| Minimum | 0.0 |
| Maximum | 20.0 |
| Zeros | 22 |
| Zeros (%) | 0.5% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5.7 |
| Q1 | 8.5 |
| median | 10.3 |
| Q3 | 12 |
| 95-th percentile | 14.6 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 3.5 |
Descriptive statistics
| Standard deviation | 2.760101726 |
|---|---|
| Coefficient of variation (CV) | 0.2691188309 |
| Kurtosis | 0.7029511928 |
| Mean | 10.25607059 |
| Median Absolute Deviation (MAD) | 1.8 |
| Skewness | -0.2413595394 |
| Sum | 43588.3 |
| Variance | 7.618161539 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.1 | 75 | 1.8% | |
| 9.8 | 73 | 1.7% | |
| 11.4 | 73 | 1.7% | |
| 10.2 | 72 | 1.7% | |
| 10.9 | 71 | 1.7% | |
| 11.3 | 70 | 1.6% | |
| 10.1 | 69 | 1.6% | |
| 9.7 | 68 | 1.6% | |
| 9.5 | 66 | 1.6% | |
| 10.5 | 66 | 1.6% | |
| Other values (158) | 3547 | 83.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 22 | 0.5% | |
| 0.4 | 1 | < 0.1% | |
| 1.1 | 2 | < 0.1% | |
| 1.3 | 1 | < 0.1% | |
| 2 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 1 | < 0.1% | |
| 19.7 | 2 | < 0.1% | |
| 19.3 | 1 | < 0.1% | |
| 19.2 | 1 | < 0.1% | |
| 18.9 | 1 | < 0.1% |
total_intl_calls
Real number (ℝ≥0)
| Distinct count | 21 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.426352941176471 |
|---|---|
| Minimum | 0 |
| Maximum | 20 |
| Zeros | 22 |
| Zeros (%) | 0.5% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 9 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.463069113 |
|---|---|
| Coefficient of variation (CV) | 0.5564556522 |
| Kurtosis | 3.263227525 |
| Mean | 4.426352941 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.360122209 |
| Sum | 18812 |
| Variance | 6.066709454 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 847 | 19.9% | |
| 4 | 795 | 18.7% | |
| 2 | 644 | 15.2% | |
| 5 | 598 | 14.1% | |
| 6 | 408 | 9.6% | |
| 7 | 272 | 6.4% | |
| 1 | 226 | 5.3% | |
| 8 | 153 | 3.6% | |
| 9 | 126 | 3.0% | |
| 10 | 59 | 1.4% | |
| Other values (11) | 122 | 2.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 22 | 0.5% | |
| 1 | 226 | 5.3% | |
| 2 | 644 | 15.2% | |
| 3 | 847 | 19.9% | |
| 4 | 795 | 18.7% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 1 | < 0.1% | |
| 19 | 1 | < 0.1% | |
| 18 | 4 | 0.1% | |
| 17 | 1 | < 0.1% | |
| 16 | 7 | 0.2% |
total_intl_charge
Real number (ℝ≥0)
| Distinct count | 168 |
|---|---|
| Unique (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.7696541176470584 |
|---|---|
| Minimum | 0.0 |
| Maximum | 5.4 |
| Zeros | 22 |
| Zeros (%) | 0.5% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1.54 |
| Q1 | 2.3 |
| median | 2.78 |
| Q3 | 3.24 |
| 95-th percentile | 3.94 |
| Maximum | 5.4 |
| Range | 5.4 |
| Interquartile range (IQR) | 0.94 |
Descriptive statistics
| Standard deviation | 0.7452041364 |
|---|---|
| Coefficient of variation (CV) | 0.2690603609 |
| Kurtosis | 0.7033212689 |
| Mean | 2.769654118 |
| Median Absolute Deviation (MAD) | 0.48 |
| Skewness | -0.2416706661 |
| Sum | 11771.03 |
| Variance | 0.5553292049 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 75 | 1.8% | |
| 3.08 | 73 | 1.7% | |
| 2.65 | 73 | 1.7% | |
| 2.75 | 72 | 1.7% | |
| 2.94 | 71 | 1.7% | |
| 3.05 | 70 | 1.6% | |
| 2.73 | 69 | 1.6% | |
| 2.62 | 68 | 1.6% | |
| 2.84 | 66 | 1.6% | |
| 2.57 | 66 | 1.6% | |
| Other values (158) | 3547 | 83.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 22 | 0.5% | |
| 0.11 | 1 | < 0.1% | |
| 0.3 | 2 | < 0.1% | |
| 0.35 | 1 | < 0.1% | |
| 0.54 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.4 | 1 | < 0.1% | |
| 5.32 | 2 | < 0.1% | |
| 5.21 | 1 | < 0.1% | |
| 5.18 | 1 | < 0.1% | |
| 5.1 | 1 | < 0.1% |
number_customer_service_calls
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5590588235294118 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 886 |
| Zeros (%) | 20.8% |
| Memory size | 33.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.31143353 |
|---|---|
| Coefficient of variation (CV) | 0.8411700126 |
| Kurtosis | 1.655618759 |
| Mean | 1.559058824 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.082691586 |
| Sum | 6626 |
| Variance | 1.719857904 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1524 | 35.9% | |
| 2 | 947 | 22.3% | |
| 0 | 886 | 20.8% | |
| 3 | 558 | 13.1% | |
| 4 | 209 | 4.9% | |
| 5 | 81 | 1.9% | |
| 6 | 28 | 0.7% | |
| 7 | 13 | 0.3% | |
| 9 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 886 | 20.8% | |
| 1 | 1524 | 35.9% | |
| 2 | 947 | 22.3% | |
| 3 | 558 | 13.1% | |
| 4 | 209 | 4.9% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 7 | 13 | 0.3% | |
| 6 | 28 | 0.7% | |
| 5 | 81 | 1.9% |
churn
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 33.3 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| no | 3652 | 85.9% | |
| yes | 598 | 14.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
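The calculation described above can be sketched in plain Python. This is a minimal illustration using the day-time minutes and charges from the first five sample rows of this report; the function name `pearson_r` is an assumption for this example and is not the routine the report itself calls.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


# total_day_minutes and total_day_charge from the first five sample rows.
minutes = [161.6, 243.4, 299.4, 166.7, 218.2]
charge = [27.47, 41.38, 50.90, 28.34, 37.09]

r = pearson_r(minutes, charge)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([60 * m + 5 for m in minutes], charge)
print(round(r, 6), round(r_rescaled, 6))
```

Both values come out essentially equal to 1, which is consistent with the high-correlation warnings raised for the minutes and charge columns.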
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
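As a rough sketch of that recipe, the code below ranks both variables (ties receive the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The data are synthetic and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 4))
```

Here ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.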
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
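The pair-counting definition can be sketched directly. Note that this is the plain form described in the text (often called tau-a); library implementations commonly apply a tie correction (tau-b), so results can differ when ties are present, and which variant the report uses is not stated here. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

tau = kendall_tau_a(x, y)
print(tau)  # 7 concordant and 3 discordant pairs out of 10 -> 0.4
```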
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | state | account_length | area_code | international_plan | voice_mail_plan | number_vmail_messages | total_day_minutes | total_day_calls | total_day_charge | total_eve_minutes | total_eve_calls | total_eve_charge | total_night_minutes | total_night_calls | total_night_charge | total_intl_minutes | total_intl_calls | total_intl_charge | number_customer_service_calls | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | OH | 107 | area_code_415 | no | yes | 26 | 161.6 | 123 | 27.47 | 195.5 | 103 | 16.62 | 254.4 | 103 | 11.45 | 13.7 | 3 | 3.70 | 1 | no |
| 1 | NJ | 137 | area_code_415 | no | no | 0 | 243.4 | 114 | 41.38 | 121.2 | 110 | 10.30 | 162.6 | 104 | 7.32 | 12.2 | 5 | 3.29 | 0 | no |
| 2 | OH | 84 | area_code_408 | yes | no | 0 | 299.4 | 71 | 50.90 | 61.9 | 88 | 5.26 | 196.9 | 89 | 8.86 | 6.6 | 7 | 1.78 | 2 | no |
| 3 | OK | 75 | area_code_415 | yes | no | 0 | 166.7 | 113 | 28.34 | 148.3 | 122 | 12.61 | 186.9 | 121 | 8.41 | 10.1 | 3 | 2.73 | 3 | no |
| 4 | MA | 121 | area_code_510 | no | yes | 24 | 218.2 | 88 | 37.09 | 348.5 | 108 | 29.62 | 212.6 | 118 | 9.57 | 7.5 | 7 | 2.03 | 3 | no |
| 5 | MO | 147 | area_code_415 | yes | no | 0 | 157.0 | 79 | 26.69 | 103.1 | 94 | 8.76 | 211.8 | 96 | 9.53 | 7.1 | 6 | 1.92 | 0 | no |
| 6 | LA | 117 | area_code_408 | no | no | 0 | 184.5 | 97 | 31.37 | 351.6 | 80 | 29.89 | 215.8 | 90 | 9.71 | 8.7 | 4 | 2.35 | 1 | no |
| 7 | WV | 141 | area_code_415 | yes | yes | 37 | 258.6 | 84 | 43.96 | 222.0 | 111 | 18.87 | 326.4 | 97 | 14.69 | 11.2 | 5 | 3.02 | 0 | no |
| 8 | IN | 65 | area_code_415 | no | no | 0 | 129.1 | 137 | 21.95 | 228.5 | 83 | 19.42 | 208.8 | 111 | 9.40 | 12.7 | 6 | 3.43 | 4 | yes |
| 9 | RI | 74 | area_code_415 | no | no | 0 | 187.7 | 127 | 31.91 | 163.4 | 148 | 13.89 | 196.0 | 94 | 8.82 | 9.1 | 5 | 2.46 | 0 | no |
Last rows
| | state | account_length | area_code | international_plan | voice_mail_plan | number_vmail_messages | total_day_minutes | total_day_calls | total_day_charge | total_eve_minutes | total_eve_calls | total_eve_charge | total_night_minutes | total_night_calls | total_night_charge | total_intl_minutes | total_intl_calls | total_intl_charge | number_customer_service_calls | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4240 | AR | 127 | area_code_415 | no | yes | 27 | 157.6 | 107 | 26.79 | 280.6 | 49 | 23.85 | 75.1 | 77 | 3.38 | 8.0 | 4 | 2.16 | 1 | no |
| 4241 | WA | 80 | area_code_510 | no | no | 0 | 157.0 | 101 | 26.69 | 208.8 | 127 | 17.75 | 113.3 | 109 | 5.10 | 16.2 | 2 | 4.37 | 2 | no |
| 4242 | MN | 150 | area_code_408 | no | no | 0 | 170.0 | 115 | 28.90 | 162.7 | 138 | 13.83 | 267.2 | 77 | 12.02 | 8.3 | 2 | 2.24 | 0 | no |
| 4243 | ND | 140 | area_code_510 | no | no | 0 | 244.7 | 115 | 41.60 | 258.6 | 101 | 21.98 | 231.3 | 112 | 10.41 | 7.5 | 6 | 2.03 | 1 | yes |
| 4244 | AZ | 97 | area_code_510 | no | no | 0 | 252.6 | 89 | 42.94 | 340.3 | 91 | 28.93 | 256.5 | 67 | 11.54 | 8.8 | 5 | 2.38 | 1 | yes |
| 4245 | MT | 83 | area_code_415 | no | no | 0 | 188.3 | 70 | 32.01 | 243.8 | 88 | 20.72 | 213.7 | 79 | 9.62 | 10.3 | 6 | 2.78 | 0 | no |
| 4246 | WV | 73 | area_code_408 | no | no | 0 | 177.9 | 89 | 30.24 | 131.2 | 82 | 11.15 | 186.2 | 89 | 8.38 | 11.5 | 6 | 3.11 | 3 | no |
| 4247 | NC | 75 | area_code_408 | no | no | 0 | 170.7 | 101 | 29.02 | 193.1 | 126 | 16.41 | 129.1 | 104 | 5.81 | 6.9 | 7 | 1.86 | 1 | no |
| 4248 | HI | 50 | area_code_408 | no | yes | 40 | 235.7 | 127 | 40.07 | 223.0 | 126 | 18.96 | 297.5 | 116 | 13.39 | 9.9 | 5 | 2.67 | 2 | no |
| 4249 | VT | 86 | area_code_415 | no | yes | 34 | 129.4 | 102 | 22.00 | 267.1 | 104 | 22.70 | 154.8 | 100 | 6.97 | 9.3 | 16 | 2.51 | 0 | no |